53 research outputs found
An Abstract Method Linearization for Detecting Source Code Plagiarism in Object-Oriented Environment
Although plagiarizing source code is a trivial task for most CS students,
detecting such unethical behavior requires a considerable amount of effort.
Thus, several plagiarism detection systems have been developed to handle this
issue. This paper extends Karnalim's work, a low-level approach for detecting
Java source code plagiarism, by incorporating abstract method linearization.
This extension is intended to enhance the accuracy of the low-level approach
in detecting plagiarism in object-oriented environments. According to our
evaluation, which was conducted on 23 design-pattern source code pairs, our
extended low-level approach is more effective than both the state-of-the-art
approach and Karnalim's approach. On the one hand, when compared to the
state-of-the-art approach, our approach generates fewer coincidental
similarities and provides more accurate results. On the other hand, when
compared to Karnalim's approach, our approach can, to some extent, generate
higher similarity when simple abstract method invocation is incorporated.
Comment: The 8th International Conference on Software Engineering and Service
Scienc
The Effectiveness of Low-Level Structure-based Approach Toward Source Code Plagiarism Level Taxonomy
The low-level approach is a novel way to detect source code plagiarism. It has
been shown to be effective in a controlled environment when compared to a
baseline approach (i.e., an approach which relies on source code token
subsequence matching). We evaluate the effectiveness of the state of the art
in the low-level approach based on Faidhi & Robinson's plagiarism level
taxonomy; real plagiarism cases are employed as the dataset in this work. Our
evaluation shows that the state of the art in the low-level approach is
effective at handling most plagiarism attacks. Further, it also outperforms
its predecessor and the baseline approach at most plagiarism levels.
Comment: The 6th International Conference on Information and Communication
Technolog
Dynamic Thresholding Mechanisms for IR-Based Filtering in Efficient Source Code Plagiarism Detection
To address time inefficiency, only potential pairs are compared in
string-matching-based source code plagiarism detection. Potentiality is
defined through a fast yet order-insensitive similarity measurement (adapted
from Information Retrieval), and only pairs whose similarity degrees are
higher than or equal to a particular threshold are selected. Defining such a
threshold is not a trivial task, considering that the threshold should lead to
a high efficiency improvement and a low effectiveness reduction (if the
latter is unavoidable). This paper proposes two thresholding mechanisms,
namely the range-based and pair-count-based mechanisms, that dynamically tune
the threshold based on the distribution of the resulting similarity degrees.
According to our evaluation, both mechanisms are more practical than manual
threshold assignment since they are more proportional to efficiency
improvement and effectiveness reduction.
Comment: The 2018 International Conference on Advanced Computer Science and
Information Systems (ICACSIS
TF-IDF Inspired Detection for Cross-Language Source Code Plagiarism and Collusion
Several computing courses allow students to choose which programming language they want to use for completing a programming task. This can lead to cross-language code plagiarism and collusion, in which the copied code file is rewritten in another programming language. In response, this paper proposes a detection technique which is able to accurately compare code files written in various programming languages, with limited effort needed to accommodate such languages at the development stage. The only language-dependent feature used in the technique is the source code tokeniser, and no code conversion is applied. The impact of coincidental similarity is reduced by applying a TF-IDF inspired weighting, in which rare matches are prioritised. Our evaluation shows that the technique outperforms common techniques in academia for handling language conversion disguises. Further, it is comparable to those techniques when dealing with conventional disguises.
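The idea of prioritising rare matches can be illustrated with a minimal sketch. It is not the paper's actual formula: the generic token names, the use of token bigrams, the `log(1 + N/df)` weight, and the weighted-overlap similarity are all assumptions made for illustration, and the function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n=2):
    """Return the list of token n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def idf_weights(docs, n=2):
    """Weight each n-gram by how rare it is across the whole corpus.

    Assumed weighting: log(1 + N / df), where df is the number of files
    containing the n-gram. Rare n-grams therefore get larger weights.
    """
    df = Counter()
    for tokens in docs.values():
        df.update(set(ngrams(tokens, n)))
    total = len(docs)
    return {gram: math.log(1 + total / count) for gram, count in df.items()}


def weighted_similarity(a, b, weights, n=2):
    """Weighted overlap of two token sequences taken from the corpus."""
    ca, cb = Counter(ngrams(a, n)), Counter(ngrams(b, n))
    shared = sum(min(ca[g], cb[g]) * weights[g] for g in ca.keys() & cb.keys())
    union = sum(max(ca[g], cb[g]) * weights[g] for g in ca.keys() | cb.keys())
    return shared / union if union else 0.0


# Hypothetical submissions, already mapped to language-neutral tokens by
# per-language tokenisers (the only language-dependent step).
docs = {
    "alice.py": ["ID", "=", "NUM", "WHILE", "ID", "<", "NUM",
                 "ID", "=", "ID", "*", "ID"],
    "bob.java": ["ID", "=", "NUM", "WHILE", "ID", "<", "NUM",
                 "ID", "=", "ID", "*", "ID"],
    "carol.py": ["ID", "=", "NUM", "PRINT", "ID"],
}
weights = idf_weights(docs)
sim_copy = weighted_similarity(docs["alice.py"], docs["bob.java"], weights)
sim_other = weighted_similarity(docs["alice.py"], docs["carol.py"], weights)
```

In this toy corpus the bigram `("ID", "=")` occurs in every file and so contributes little, while `("WHILE", "ID")` occurs in only two files and contributes more, which is the intended effect of down-weighting coincidental matches.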
Improving Scalability of Java Archive Search Engine Through Recursion Conversion and Multithreading
Based on the fact that bytecode always exists in Java archives, a bytecode-based Java archive search engine has been developed [1, 2]. Although this system is quite effective, it still lacks scalability since many modules apply recursive calls and the system only utilizes one core (single thread). In this research, the Java archive search engine architecture is redesigned in order to improve its scalability. All recursions are converted to iterative forms, although most of these modules are logically recursive and quite difficult to convert (e.g. Tarjan's strongly connected component algorithm). Recursion conversion can be conducted by following the respective recursive pattern. Each recursion is broken down into four parts (before and after actions of the current node and of its children) and converted to iteration with the help of a caller reference. This conversion mechanism improves scalability by avoiding the stack overflow errors caused by method calls. System scalability is also improved by applying multithreading, which successfully reduces processing time. Shorter processing time may enable the system to handle larger data. Multithreading is applied to the major parts, which are the indexer, the vector space model (VSM) retriever, the low-rank vector space model (LRVSM) retriever, and the semantic relatedness calculator (the semantic relatedness calculator also involves multiprocessing). The correctness of both the recursion conversion and the multithreaded design is supported by the fact that all implementations yield similar results.
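The four-part decomposition can be sketched as follows. This is a generic illustration in Python rather than the authors' Java implementation: the function name, the callback names, and the dictionary-based graph are assumptions. The explicit stack plays the role of the call stack, and the entry below the top of the stack serves as the caller reference.

```python
def iterative_dfs(graph, root, before_node, before_child, after_child, after_node):
    """Depth-first traversal without recursion.

    Mirrors the recursive pattern
        visit(node): before_node; for each unvisited child:
            before_child; visit(child); after_child; finally after_node
    using an explicit stack of (node, iterator over its children).
    """
    visited = {root}
    before_node(root)
    stack = [(root, iter(graph.get(root, ())))]
    while stack:
        node, children = stack[-1]
        descended = False
        for child in children:
            if child in visited:
                continue
            before_child(node, child)
            visited.add(child)
            before_node(child)
            stack.append((child, iter(graph.get(child, ()))))
            descended = True
            break  # resume this node's iterator after the child finishes
        if not descended:
            stack.pop()
            after_node(node)
            if stack:
                caller = stack[-1][0]  # the "caller reference"
                after_child(caller, node)


# Usage: subtree sizes on a chain far deeper than Python's default
# recursion limit, which a naive recursive version would overflow.
depth = 50_000
chain = {i: [i + 1] for i in range(depth - 1)}
size = {}


def start_count(node):
    size[node] = 1


def add_child_size(parent, child):
    size[parent] += size[child]


def do_nothing(*_args):
    pass


iterative_dfs(chain, 0, start_count, do_nothing, add_child_size, do_nothing)
```

Because each stack entry keeps its own child iterator, a node resumes exactly where it left off once a child is finished, which is what the return address does in the recursive form.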
Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine
Using bytecode as an information source is a novel approach which enables a Java archive search engine to be built without relying on any resource other than the Java archive itself [1]. Unfortunately, its effectiveness is not particularly high since some relevant documents may not be retrieved because of vocabulary mismatch. In this research, a vector space model (VSM) is extended with semantic relatedness to overcome the vocabulary mismatch issue in the Java archive search engine. Aiming for the most effective retrieval model, several equations for retrieval models are also proposed and evaluated, such as summing up all related terms, substituting a non-existing term with its most related term, logarithmic normalization, context-specific relatedness, and low-rank query-related retrieved documents. In general, semantic relatedness improves recall at the cost of reduced precision. We also propose a scheme that takes advantage of relatedness without being affected by its disadvantage (VSM + considering non-retrieved documents as low-rank retrieved documents using semantic relatedness). This scheme ensures that a relatedness score is ranked lower than a standard exact-match score. This scheme yields 1.754% higher effectiveness than our standard VSM.
The Use of Python Tutor on Programming Laboratory Session: Student Perspectives
Based on the fact that the impact of educational tools can only be accurately measured through student-centered evaluation, this paper proposes a long-term in-class evaluation of Python Tutor, a program visualization tool developed by Guo. The evaluation involves 53 students from 4 Basic Data Structure classes, which were held in the even semester of the 2016/2017 academic year. It is based on a questionnaire survey given to the students after they had used Python Tutor in half of their programming laboratory sessions. In general, there are three findings from this work. Firstly, Python Tutor helps students to complete programming laboratory tasks, specifically for Basic Data Structure material. Secondly, Python Tutor helps students to understand general programming aspects, which are execution flow, variable content change, method invocation sequence, object reference, syntax error, and logic error. Finally, based on student perspectives, Python Tutor is a helpful tool that positively affects the students.
Complexitor: an Educational Tool for Learning Algorithm TIME Complexity in Practical Manner
Based on an informal survey, algorithm time complexity can be rather difficult to understand when learned in a theoretical manner. Therefore, this research proposed Complexitor, an educational tool for learning algorithm time complexity in a practical manner. Students could learn how to determine algorithm time complexity through the actual execution of an algorithm implementation. They were only required to provide the algorithm implementation (i.e. source code written in a particular programming language) and test cases to learn time complexity. After input was given, Complexitor generated an execution sequence based on the test cases and determined its time complexity through Pearson correlation. The algorithm time complexity with the highest correlation value toward the execution sequence was assigned as the result. Based on the evaluation, it can be concluded that this mechanism is quite effective for determining time complexity as long as the distribution of the given input set is balanced.
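The correlation step can be sketched as below. This is a simplified illustration, not Complexitor itself: the candidate complexity classes, the instrumented bubble sort used to obtain step counts, and all function names are assumptions. A constant-time candidate is left out because a constant series has zero variance, which makes its Pearson correlation undefined.

```python
import math

# Assumed candidate growth functions, labelled by complexity class.
CANDIDATES = {
    "log n": lambda n: math.log(n),
    "n": lambda n: float(n),
    "n log n": lambda n: n * math.log(n),
    "n^2": lambda n: float(n ** 2),
    "n^3": lambda n: float(n ** 3),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    count = len(xs)
    mean_x, mean_y = sum(xs) / count, sum(ys) / count
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y) if dev_x and dev_y else 0.0


def estimate_complexity(sizes, steps):
    """Return the candidate label whose growth correlates best with steps."""
    scores = {
        label: pearson([growth(n) for n in sizes], steps)
        for label, growth in CANDIDATES.items()
    }
    return max(scores, key=scores.get), scores


def count_bubble_sort_steps(n):
    """Run bubble sort on a reversed list and count the comparisons made."""
    data = list(range(n, 0, -1))
    steps = 0
    for i in range(n):
        for j in range(n - 1 - i):
            steps += 1
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
    return steps


sizes = [10, 20, 40, 80, 160, 320]
steps = [count_bubble_sort_steps(n) for n in sizes]
best_label, scores = estimate_complexity(sizes, steps)
```

Pearson correlation only measures how linear the relationship is, so several candidates can score close to 1 on a narrow or skewed range of input sizes; spreading the sizes widely, as here, helps separate them, which is consistent with the abstract's remark about balanced input.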
AP-ASD1 : an Indonesian Desktop-based Educational Tool for Basic Data Structure Course
Although there are many available data structure educational tools, it is quite difficult to find a suitable tool to aid students in learning a particular course [1]. Several major impediments in choosing a tool are teaching preferences, language barriers, confusing terminology, internet dependency, varying degrees of material difficulty, and other environmental aspects. In this research, a data structure educational tool called AP-ASD1 is developed based on a basic algorithm and data structure course (ASD 1). Since AP-ASD1 is developed to follow the course materials and not vice versa, this educational tool is guaranteed to fit our needs. The feasibility of AP-ASD1 is evaluated based on two factors, which are functional correctness and a survey. All features function correctly and yield the expected output, whereas the survey yields a fairly good result (84.305% achievement rate). Based on our survey, AP-ASD1 meets the eligibility standard and its features are successfully integrated. The survey also concludes that this application is quite effective as a supportive tool for learning basic data structures.